Results 1 - 20 of 96
1.
Mol Inform ; : e202300263, 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38386182

ABSTRACT

Increasing antimicrobial resistance (AMR) represents a global healthcare threat. To decrease the spread of AMR and associated mortality, methods for rapid selection of optimal antibiotic treatment are urgently needed. Machine learning (ML) models based on genomic data to predict resistant phenotypes can serve as a fast screening tool prior to phenotypic testing. Nonetheless, many existing ML methods lack interpretability. Therefore, we present a methodology for visualization of sequence space and AMR prediction based on the non-linear dimensionality reduction method - generative topographic mapping (GTM). This approach, applied to AMR data of >5000 S. aureus isolates retrieved from the PATRIC database, yielded GTM models with reasonable accuracy for all drugs (balanced accuracy values ≥0.75). The Generative Topographic Maps (GTMs) represent data in the form of illustrative maps of the genomic space and allow for antibiotic-wise comparison of resistant phenotypes. The maps were also found to be useful for the analysis of genetic determinants responsible for drug resistance. Overall, the GTM-based methodology is a useful tool for both the illustrative exploration of the genomic sequence space and AMR prediction.
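The abstract above reports balanced accuracy values. As a reminder of what that metric measures, here is a minimal sketch that computes it for binary resistant/susceptible labels. The label vectors are hypothetical illustration data and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Labels are assumed to be 1 (resistant) or 0 (susceptible).
    """
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    sensitivity = true_pos / positives   # recall on resistant isolates
    specificity = true_neg / negatives   # recall on susceptible isolates
    return (sensitivity + specificity) / 2


# Hypothetical phenotypes and predictions for ten isolates.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(round(balanced_accuracy(y_true, y_pred), 3))
```

Unlike plain accuracy, this average of per-class recalls is not inflated when one phenotype dominates the data set, which is why it is commonly reported for imbalanced resistance data.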

2.
J Chem Inf Model ; 63(17): 5571-5582, 2023 09 11.
Article in English | MEDLINE | ID: mdl-37602843

ABSTRACT

In chemical library analysis, it may be useful to describe libraries as individual items rather than collections of compounds. This is particularly true for ultra-large noncherry-pickable compound mixtures, such as DNA-encoded libraries (DELs). In this sense, the chemical library space (CLS) is useful for the management of a portfolio of libraries, just like chemical space (CS) helps manage a portfolio of molecules. Several possible CLSs were previously defined using vectorial library representations obtained from generative topographic mapping (GTM). Given the steadily growing number of DEL designs, the CLS becomes "crowded" and requires analysis tools beyond pairwise library comparison. Therefore, herein, we investigate the cartography of CLS on meta-(µ)GTMs ("meta" as a reminder that these are maps of the CLS, itself based on responsibility vectors issued by regular CS GTMs). 2.5 K DELs and ChEMBL (reference) were projected on the µGTM, producing landscapes of library-specific properties. These describe both interlibrary similarity and intrinsic library characteristics in the same view, thereby facilitating the selection of the best project-specific libraries.


Subject(s)
Small Molecule Libraries , Gene Library
3.
J Chem Inf Model ; 63(16): 5107-5119, 2023 08 28.
Article in English | MEDLINE | ID: mdl-37556857

ABSTRACT

This study introduces a new de novo design algorithm called GENERA that combines the capabilities of a deep-learning algorithm for automated drug-like analogue design, called DeLA-Drug, with a genetic algorithm for generating molecules with desired target-oriented properties. Specifically, GENERA was applied to the angiotensin-converting enzyme 2 (ACE2) target, which is implicated in many pathological conditions, including COVID-19. The ability of GENERA to de novo design promising candidates for a specific target was assessed using two docking programs, PLANTS and GLIDE. A fitness function based on the Pareto dominance resulting from computed PLANTS and GLIDE scores was applied to demonstrate the algorithm's ability to perform multiobjective optimizations effectively. GENERA can quickly generate focused libraries that produce better scores compared to a starting set of known ACE2 binders. This study is the first to utilize a DL-based algorithm designed for analogue generation as a mutational operator within a GA framework, representing an innovative approach to target-oriented de novo design.


Subject(s)
COVID-19 , Deep Learning , Humans , Algorithms , Drug Design
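The GENERA abstract mentions a fitness function based on Pareto dominance over two docking scores. The abstract does not give the exact function, so the following is only a generic sketch of Pareto dominance and front extraction. It assumes that lower (more negative) scores are better for both objectives, and the score pairs are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (lower score = better, by assumption)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores):
    """Return the indices of candidates not dominated by any other."""
    front = []
    for i, candidate in enumerate(scores):
        dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (PLANTS-like, GLIDE-like) score pairs for four molecules.
scores = [(-95.0, -8.1), (-90.0, -9.0), (-85.0, -7.5), (-80.0, -9.5)]
print(pareto_front(scores))  # the third molecule is dominated by the first
```

In a genetic algorithm, membership in (or distance from) such a front is one common way to rank individuals without collapsing the two scores into a single weighted sum.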
4.
J Chem Inf Model ; 63(13): 4042-4055, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37368824

ABSTRACT

The development of DNA-encoded library (DEL) technology introduced new challenges for the analysis of chemical libraries. It is often useful to consider a chemical library as a stand-alone chemoinformatic object (represented both as a collection of independent molecules and yet as an individual entity), in particular when libraries are inseparable mixtures, like DELs. Herein, we introduce the concept of chemical library space (CLS), in which resident items are individual chemical libraries. We define and compare four vectorial library representations obtained using generative topographic mapping. These allow for an effective comparison of libraries, with the ability to tune and chemically interpret the similarity relationships. In particular, property-tuned CLS encodings enable the simultaneous comparison of libraries with respect to both property and chemotype distributions. We apply the various CLS encodings to the problem of selecting DELs that optimally "match" a reference collection (here ChEMBL28), showing how the choice of the CLS descriptors may help to fine-tune the "matching" (overlap) criteria. Hence, the proposed CLS may represent a new efficient way for polyvalent analysis of thousands of chemical libraries. Selection of an easily accessible compound collection for drug discovery, as a substitute for a difficult-to-produce reference library, can be tuned for either primary or target-focused screening, also considering property distributions of compounds. Alternatively, selection of libraries covering novel regions of the chemical space with respect to a reference compound subspace may serve for library portfolio enrichment.


Subject(s)
DNA , Small Molecule Libraries , Small Molecule Libraries/chemistry , DNA/chemistry , Gene Library , Drug Discovery/methods
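The abstract above describes vectorial library representations derived from generative topographic mapping but does not define them. As a hedged illustration only, the sketch below assumes one plausible encoding: a library vector obtained by averaging the per-compound responsibility vectors over map nodes, with libraries then compared by a continuous Tanimoto coefficient. The function names and the three-node responsibility values are hypothetical.

```python
def library_vector(responsibilities):
    """Average per-compound responsibility vectors (one value per map
    node) into a single library-level vector."""
    n_compounds = len(responsibilities)
    n_nodes = len(responsibilities[0])
    return [
        sum(r[k] for r in responsibilities) / n_compounds
        for k in range(n_nodes)
    ]


def tanimoto(u, v):
    """Continuous Tanimoto coefficient: u.v / (|u|^2 + |v|^2 - u.v)."""
    dot = sum(a * b for a, b in zip(u, v))
    denom = sum(a * a for a in u) + sum(b * b for b in v) - dot
    return dot / denom if denom else 0.0


# Hypothetical responsibilities of two compounds per library on a 3-node map.
lib_a = [[0.8, 0.2, 0.0], [0.6, 0.4, 0.0]]
lib_b = [[0.0, 0.2, 0.8], [0.0, 0.4, 0.6]]

vec_a = library_vector(lib_a)
vec_b = library_vector(lib_b)
print(vec_a, vec_b, round(tanimoto(vec_a, vec_b), 4))
```

Because each library collapses to one fixed-length vector, thousands of libraries can be compared pairwise, or mapped again, without revisiting individual compounds.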
5.
Biomolecules ; 13(2)2023 02 02.
Article in English | MEDLINE | ID: mdl-36830654

ABSTRACT

Microtubules are highly dynamic polymers of α,β-tubulin dimers which play an essential role in numerous cellular processes such as cell proliferation and intracellular transport, making them an attractive target for cancer and neurodegeneration research. To date, a large number of known tubulin binders were derived from natural products, while only one was developed by rational structure-based drug design. Several of these tubulin binders show promising in vitro profiles while presenting unacceptable off-target effects when tested in patients. Therefore, there is a continuing demand for the discovery of safer and more efficient tubulin-targeting agents. Since tubulin structural data is readily available, the employment of computer-aided design techniques can be a key element to focus on the relevant chemical space and guide the design process. Due to the high diversity and quantity of structural data available, we compiled here a guide to the accessible tubulin-ligand structures. Furthermore, we review different ligand and structure-based methods recently used for the successful selection and design of new tubulin-targeting agents.


Subject(s)
Antineoplastic Agents , Neoplasms , Humans , Tubulin , Ligands , Antineoplastic Agents/chemistry , Microtubules , Neoplasms/drug therapy
6.
Chemistry ; 29(5): e202300069, 2023 Jan 24.
Article in English | MEDLINE | ID: mdl-36692211

ABSTRACT

Invited for the cover of this issue are the groups of Professors Passarella and Pieraccini at the University of Milan, in collaboration with some of the members of the TubInTrain consortium. The image depicts work with the elements of nature, in particular the destabilising effect of maytansinol (the constellation) on microtubules (the trees). Read the full text of the article at 10.1002/chem.202203431.


Subject(s)
Maytansine , Microtubules , Research , Social Group
7.
Mol Inform ; 42(4): e2200208, 2023 04.
Article in English | MEDLINE | ID: mdl-36604304

ABSTRACT

In order to analyze the Chimiothèque Nationale (CN) - The French National Compound Library - in the context of screening and biologically relevant compounds, the library was compared with the ZINC in-stock collection and ChEMBL. This includes the study of chemical space coverage, physicochemical properties and Bemis-Murcko (BM) scaffold populations. More than 5 K CN-unique scaffolds (relative to ZINC and ChEMBL collections) were identified. Generative Topographic Maps (GTMs) accommodating those libraries were generated and used to compare the compound populations. Hierarchical GTM («zooming») was applied to generate an ensemble of maps at various resolution levels, from global overview to precise mapping of individual structures. The respective maps were added to the ChemSpace Atlas website. The analysis of synthetic accessibility in the context of combinatorial chemistry showed that only 29.7% of CN compounds can be fully synthesized using commercially available building blocks.


Subject(s)
Databases, Chemical
8.
Chemistry ; 29(5): e202203431, 2023 Jan 24.
Article in English | MEDLINE | ID: mdl-36468686

ABSTRACT

Maytansinoids are a successful class of natural and semisynthetic tubulin binders, known for their potent cytotoxic activity. Their wider application as cytotoxins and chemical probes to study tubulin dynamics has been held back by the complexity of natural product chemistry. Here we report the synthesis of long-chain derivatives and maytansinoid conjugates. We confirmed that bulky substituents do not impact their high activity or the scaffold's binding mode. These encouraging results open new avenues for the design of new maytansine-based probes.


Subject(s)
Antineoplastic Agents , Maytansine , Tubulin/metabolism , Antineoplastic Agents/metabolism , Microtubules
9.
J Chem Inf Model ; 62(22): 5471-5484, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36332178

ABSTRACT

In order to better formalize it, the notorious inverse-QSAR problem (finding structures of given QSAR-predicted properties) is considered in this paper as a two-step process including (i) finding "seed" descriptor vectors corresponding to user-constrained QSAR model output values and (ii) identifying the chemical structures best matching the "seed" vectors. The main development effort here was focused on the latter stage, proposing a new attention-based conditional variational autoencoder neural-network architecture based on recent developments in attention-based methods. The obtained results show that this workflow was capable of generating compounds predicted to display desired activity while being completely novel compared to the training database (ChEMBL). Moreover, the generated compounds show acceptable druglikeness and synthetic accessibility. Both pharmacophore and docking studies were carried out as "orthogonal" in silico validation methods, proving that some of the de novo structures are, beyond being predicted active by 2D-QSAR models, clearly able to match binding 3D pharmacophores and bind the protein pocket.


Subject(s)
Quantitative Structure-Activity Relationship , Molecular Docking Simulation
10.
Molecules ; 27(17)2022 Aug 24.
Article in English | MEDLINE | ID: mdl-36080168

ABSTRACT

New models for ACE2 receptor binding, based on QSAR and docking algorithms, were developed, using XRD structural data and ChEMBL 26 database hits as training sets. The selectivity of the potential ACE2-binding ligands towards Neprilysin (NEP) and ACE was evaluated. The Enamine screening collection (3.2 million compounds) was virtually screened according to the above models, in order to find possible ACE2 chemical probes, useful for the study of SARS-CoV-2-induced neurological disorders. An enzymology inhibition assay for ACE2 was optimized, and the combined diversified set of predicted selective ACE2-binding molecules from QSAR modeling, docking, and ultrafast docking was screened in vitro. The in vitro hits included two novel chemotypes suitable for further optimization.


Subject(s)
Angiotensin-Converting Enzyme 2 , COVID-19 , Humans , Molecular Docking Simulation , Peptidyl-Dipeptidase A/metabolism , RNA, Viral , SARS-CoV-2
11.
J Chem Inf Model ; 62(18): 4537-4548, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36103300

ABSTRACT

Nowadays, drug discovery is inevitably intertwined with the usage of large compound collections. Understanding of their chemotype composition and physicochemical property profiles is of the highest importance for successful hit identification. Efficient polyfunctional tools allowing multifaceted analysis of constantly growing chemical libraries must be Big Data-compatible. Here, we present the freely accessible ChemSpace Atlas (https://chematlas.chimie.unistra.fr), which includes almost 40K hierarchically organized Generative Topographic Maps (GTMs) accommodating up to 500 M compounds covering fragment-like, lead-like, drug-like, PPI-like, and NP-like chemical subspaces. They allow users to navigate and analyze ZINC, ChEMBL, and COCONUT from multiple perspectives on different scales: from a bird's eye view of the entire library to structural pattern detection in small clusters. Around 20 physicochemical properties and almost 750 biological activities can be visualized (associated with map zones), supporting activity profiling and analogue search. Moreover, ChemSpace Atlas will be extended toward new chemical subspaces (e.g., DNA-encoded libraries and synthons) and functionalities (ADMETox profiling and property-guided de novo compound generation).


Subject(s)
Drug Discovery , Small Molecule Libraries , DNA/chemistry , Gene Library , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Zinc
12.
ACS Cent Sci ; 8(6): 804-813, 2022 Jun 22.
Article in English | MEDLINE | ID: mdl-35756377

ABSTRACT

Dynamic combinatorial libraries (DCLs) display adaptive behavior, enabled by the reversible generation of their molecular constituents from building blocks, in response to external effectors, e.g., protein receptors. So far, chemoinformatics has not yet been used for the design of DCLs, which pose a radically different set of challenges compared to classical library design. Here, we propose a chemoinformatic model for theoretically assessing the composition of DCLs in the presence and the absence of an effector. An imine-based DCL in interaction with the effector human carbonic anhydrase II (CA II) served as a case study. Support vector regression models for the imine formation constants and imine-CA II binding were derived from, respectively, a set of 276 imines synthesized and experimentally studied in this work and 4350 inhibitors of CA II from ChEMBL. These models predict constants for all DCL constituents, to feed software assessing equilibrium concentrations. They are publicly available on the dedicated website. Models rationally selected two amines and two aldehydes predicted to yield stable imines with high affinity for CA II and provided a virtual illustration of how effector affinity regulates DCL members.
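The abstract above refers to software that turns predicted formation constants into equilibrium concentrations. That software is not described here, so the following is only a minimal sketch for the simplest possible case: a single amine and a single aldehyde forming one imine, with water activity assumed constant and absorbed into the formation constant. The function name, constant and concentrations are hypothetical.

```python
import math


def imine_equilibrium(k_form, amine_total, aldehyde_total):
    """Equilibrium imine concentration x for amine + aldehyde <=> imine.

    Solves k_form * (amine_total - x) * (aldehyde_total - x) = x,
    taking the physically meaningful (smaller) root of the quadratic.
    """
    if k_form <= 0:
        raise ValueError("formation constant must be positive")
    s = k_form * (amine_total + aldehyde_total) + 1.0
    disc = s * s - 4.0 * k_form ** 2 * amine_total * aldehyde_total
    return (s - math.sqrt(disc)) / (2.0 * k_form)


# Hypothetical values: K = 1 L/mol, 1 mol/L of each building block.
x = imine_equilibrium(1.0, 1.0, 1.0)
print(round(x, 6))
```

A real DCL couples many such equilibria (several amines, several aldehydes, plus binding to the effector), which generally requires a numerical solver rather than a closed-form root.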

13.
Bioinformatics ; 38(8): 2307-2314, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35157024

ABSTRACT

MOTIVATION: Human immunodeficiency virus (HIV) drug resistance is a global healthcare issue. The emergence of drug resistance influenced the efficacy of treatment regimens, thus stressing the importance of treatment adaptation. Computational methods predicting the drug resistance profile from genomic data of HIV isolates are advantageous for monitoring drug resistance in patients. However, existing computational methods for drug resistance prediction are either not suitable for emerging HIV strains with complex mutational patterns or lack interpretability, which is of paramount importance in clinical practice. The approach reported here overcomes these limitations and combines high accuracy of predictions and interpretability of the models. RESULTS: In this work, a new methodology based on generative topographic mapping (GTM) for biological sequence space representation and quantitative genotype-phenotype relationships prediction purposes was introduced. The GTM-based resistance landscapes allowed us to predict the resistance of HIV strains based on sequencing and drug resistance data for three viral proteins [integrase (IN), protease (PR) and reverse transcriptase (RT)] from Stanford HIV drug resistance database. The average balanced accuracy for PR inhibitors was 0.89 ± 0.01, for IN inhibitors 0.85 ± 0.01, for non-nucleoside RT inhibitors 0.73 ± 0.01 and for nucleoside RT inhibitors 0.84 ± 0.01. We have demonstrated in several case studies that GTM-based resistance landscapes are useful for visualization and analysis of sequence space as well as for treatment optimization purposes. Here, GTMs were applied for the in-depth analysis of the relationships between mutation pattern and drug resistance using mutation landscapes. This allowed us to predict retrospectively the importance of the presence of particular mutations (e.g. V32I, L10F and L33F in HIV PR) for the resistance development. This study highlights some perspectives of GTM applications in clinical informatics and particularly in the field of sequence space exploration. AVAILABILITY AND IMPLEMENTATION: https://github.com/karinapikalyova/ISIDASeq. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
HIV Infections , HIV-1 , Humans , HIV-1/genetics , HIV-1/metabolism , Amino Acid Sequence , HIV Infections/drug therapy , Retrospective Studies , HIV Reverse Transcriptase/chemistry , HIV Reverse Transcriptase/genetics , HIV Reverse Transcriptase/metabolism , Mutation , HIV Protease/genetics , HIV Protease/metabolism , Drug Resistance , Drug Resistance, Viral/genetics , Genotype
14.
Mol Inform ; 41(6): e2100289, 2022 06.
Article in English | MEDLINE | ID: mdl-34981643

ABSTRACT

DNA-Encoded Library (DEL) technology has emerged as an alternative method for the discovery of bioactive molecules in medicinal chemistry. It enables the simple synthesis and screening of compound libraries of enormous size. Even though it gains more and more popularity each day, there are almost no reports of chemoinformatics analysis of DEL chemical space. Therefore, in this project, we aimed to generate and analyze the ultra-large chemical space of DEL. Around 2500 DELs were designed using commercially available building blocks, resulting in 2.5B DEL compounds that were compared to biologically relevant compounds from ChEMBL using Generative Topographic Mapping. This allowed us to choose several optimal DELs covering the chemical space of ChEMBL to the highest extent and thus containing the maximum possible percentage of biologically relevant chemotypes. Different combinations of DELs were also analyzed to identify a set of mutually complementary libraries that attain even higher coverage of ChEMBL than is possible with a single DEL.


Subject(s)
Drug Discovery , Small Molecule Libraries , Cheminformatics , Chemistry, Pharmaceutical , DNA/chemistry , Drug Discovery/methods , Small Molecule Libraries/chemistry
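The abstract above mentions identifying mutually complementary DELs that together cover more of ChEMBL than any single library. How the combinations were actually searched is not stated. The sketch below therefore assumes a simple greedy maximum-coverage heuristic over sets of occupied map nodes; the library names and node sets are hypothetical.

```python
def greedy_select(reference_nodes, library_nodes, max_libraries):
    """Greedily pick libraries that cover the most still-uncovered
    reference nodes; return (chosen names, fraction of reference covered)."""
    reference = set(reference_nodes)
    if not reference:
        raise ValueError("reference set must not be empty")
    uncovered = set(reference)
    chosen = []
    for _ in range(max_libraries):
        candidates = [name for name in library_nodes if name not in chosen]
        if not candidates:
            break
        # Ties are broken by dictionary insertion order.
        best = max(candidates, key=lambda n: len(uncovered & library_nodes[n]))
        if not uncovered & library_nodes[best]:
            break  # no remaining library adds coverage
        chosen.append(best)
        uncovered -= library_nodes[best]
    return chosen, 1 - len(uncovered) / len(reference)


# Hypothetical map nodes occupied by a reference set and four DELs.
reference = set(range(1, 11))
dels = {
    "DEL_A": {1, 2, 3, 4, 5},
    "DEL_B": {4, 5, 6, 7},
    "DEL_C": {8, 9},
    "DEL_D": {1, 2},
}
chosen, coverage = greedy_select(reference, dels, max_libraries=3)
print(chosen, round(coverage, 2))
```

Greedy selection is not guaranteed to find the best combination, but it illustrates why a library that ranks low on its own (here the hypothetical DEL_C) can still be valuable as a complement.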
15.
J Chem Inf Model ; 62(9): 2151-2163, 2022 05 09.
Article in English | MEDLINE | ID: mdl-34723532

ABSTRACT

Most of the existing computational tools for de novo library design are focused on the generation, rational selection, and combination of promising structural motifs to form members of the new library. However, the absence of a direct link between the chemical space of the retrosynthetically generated fragments and the pool of available reagents makes such approaches appear as rather theoretical and reality-disconnected. In this context, here we present Synthons Interpreter (SynthI), a new open-source toolkit for de novo library design that allows merging those two chemical spaces into a single synthons space. Here synthons are defined as actual fragments with valid valences and special labels, specifying the position and the nature of reactive centers. They can be issued from either the "breakup" of reference compounds according to 38 retrosynthetic rules or real reagents, after leaving group withdrawal or transformation. Such an approach not only enables the design of synthetically accessible libraries and analog generation but also facilitates reagents (building blocks) analysis in the medicinal chemistry context. SynthI code is publicly available at https://github.com/Laboratoire-de-Chemoinformatique/SynthI.


Subject(s)
Indicators and Reagents
16.
J Chem Inf Model ; 62(9): 2171-2185, 2022 05 09.
Article in English | MEDLINE | ID: mdl-34928600

ABSTRACT

The ability to efficiently synthesize desired compounds can be a limiting factor for chemical space exploration in drug discovery. This ability is conditioned not only by the existence of well-studied synthetic protocols but also by the availability of corresponding reagents, so-called building blocks (BBs). In this work, we present a detailed analysis of the chemical space of 400 000 purchasable BBs. The chemical space was defined by corresponding synthons, i.e., fragments contributed to the final molecules upon reaction. They allow an analysis of BB physicochemical properties and diversity, unbiased by the leaving and protective groups in actual reagents. The main classes of BBs were analyzed in terms of their availability, rule-of-two-defined quality, and diversity. Available BBs were eventually compared to a reference set of biologically relevant synthons derived from ChEMBL fragmentation, in order to illustrate how well they cover the actual medicinal chemistry needs. This was performed on a newly constructed universal generative topographic map of synthon chemical space that enables visualization of both libraries and analysis of their overlapped and library-specific regions.


Subject(s)
Chemistry, Pharmaceutical , Drug Discovery , Drug Discovery/methods , Indicators and Reagents
17.
Commun Chem ; 5(1): 37, 2022 Mar 18.
Article in English | MEDLINE | ID: mdl-36697737

ABSTRACT

Carbon capture and storage technologies are projected to increasingly contribute to cleaner energy transitions by significantly reducing CO2 emissions from fossil fuel-driven power and industrial plants. The industry standard technology for CO2 capture is chemical absorption with aqueous alkanolamines, which are often mixed with an activator, piperazine, to increase the overall CO2 absorption rate. Inefficiency of the process due to the parasitic energy required for thermal regeneration of the solvent drives the search for new tertiary amines with better kinetics. Improving the efficiency of experimental screening using computational tools is challenging due to the complex nature of chemical absorption. We have developed a novel computational approach that combines kinetic experiments, molecular simulations and machine learning for the in silico screening of hundreds of prospective candidates and identified a class of tertiary amines that absorbs CO2 faster than a typical commercial solvent when mixed with piperazine, which was confirmed experimentally.

18.
Environ Sci Technol ; 55(22): 15542-15553, 2021 11 16.
Article in English | MEDLINE | ID: mdl-34736317

ABSTRACT

The removal of CO2 from gases is an important industrial process in the transition to a low-carbon economy. The use of selective physical (co-)solvents is especially promising in cases when the amount of CO2 is large, as it enables one to lower the energy requirements for solvent regeneration. However, only a few physical solvents have found industrial application and the design of new ones can pave the way to more efficient gas treatment techniques. Experimental screening of gas solubility is a labor-intensive process, and solubility modeling is a viable strategy to reduce the number of solvents subject to experimental measurements. In this paper, a chemoinformatics-based modeling workflow was applied to build a predictive model for the solubility of CO2 and four other industrially important gases (CO, CH4, H2, and N2). A dataset containing solubilities of gases in 280 solvents was collected from literature sources and supplemented with the new data for six solvents measured in the present study. A modeling workflow based on the usage of several state-of-the-art machine learning algorithms was applied to establish quantitative structure-solubility relationships. The best models were used to perform virtual screening of the industrially produced chemicals. It enabled the identification of compounds with high predicted CO2 solubility and selectivity toward other gases. The prediction for one of the compounds, 4-methylmorpholine, was confirmed experimentally.


Subject(s)
Carbon Dioxide , Cheminformatics , Gases , Solubility , Solvents
19.
Mol Inform ; 40(9): e2100068, 2021 09.
Article in English | MEDLINE | ID: mdl-34170632

ABSTRACT

Natural products (NPs), being evolutionarily selected over millions of years to bind to biological macromolecules, remained an important source of inspiration for medicinal chemists even after the advent of efficient drug discovery technologies such as combinatorial chemistry and high-throughput screening. Thus, there is a strong demand for efficient and user-friendly computational tools that allow the analysis of large libraries of NPs. In this context, we introduce NP Navigator - a freely available intuitive online tool for visualization and navigation through the chemical space of NPs and NP-like molecules. It is based on the hierarchical ensemble of generative topographic maps, featuring NPs from the COlleCtion of Open NatUral producTs (COCONUT), bioactive compounds from ChEMBL and commercially available molecules from ZINC. NP Navigator allows efficient analysis of different aspects of NPs - chemotype distribution, physicochemical properties, biological activity and commercial availability of NPs. The latter concerns not only purchasable NPs but also their close analogs that can be considered as synthetic mimetics of NPs or pseudo-NPs.


Subject(s)
Biological Products , Combinatorial Chemistry Techniques , Macromolecular Substances/analysis , Zinc/chemistry
20.
Sci Rep ; 11(1): 3178, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33542271

ABSTRACT

The "creativity" of Artificial Intelligence (AI) in terms of generating de novo molecular structures opened a novel paradigm in compound design, weaknesses (stability & feasibility issues of such structures) notwithstanding. Here we show that "creative" AI may be as successfully taught to enumerate novel chemical reactions that are stoichiometrically coherent. Furthermore, when coupled to reaction space cartography, de novo reaction design may be focused on the desired reaction class. A sequence-to-sequence autoencoder with bidirectional Long Short-Term Memory layers was trained on purpose-built "SMILES/CGR" strings, encoding reactions of the USPTO database. The autoencoder latent space was visualized on a generative topographic map. Novel latent space points were sampled around a map area populated by Suzuki reactions and decoded to corresponding reactions. These can be critically analyzed by the expert, cleaned of irrelevant functional groups and eventually experimentally attempted, thereby enlarging the synthetic purpose of popular synthetic pathways.
